Overview
Brought to you by YData
Dataset statistics
| Number of variables | 40 |
|---|---|
| Number of observations | 59400 |
| Missing cells | 46743 |
| Missing cells (%) | 2.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 18.1 MiB |
| Average record size in memory | 320.0 B |
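The percentage and per-record figures above follow from the raw counts in the same table. A minimal sketch of that arithmetic, using only the numbers reported here (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics table.
n_variables = 40
n_observations = 59_400
missing_cells = 46_743
total_size_mib = 18.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes; the reported total is itself rounded,
# so the per-record figure is only approximate.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

Both results agree with the table after rounding (2.0% and roughly 320 B).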
Variable types
| Numeric | 10 |
|---|---|
| DateTime | 1 |
| Text | 7 |
| Categorical | 20 |
| Boolean | 2 |
| Alert | Type |
|---|---|
| recorded_by has constant value "GeoData Consultants Ltd" | Constant |
| basin is highly overall correlated with construction_year and 3 other fields | High correlation |
| construction_year is highly overall correlated with basin and 3 other fields | High correlation |
| extraction_type is highly overall correlated with extraction_type_class and 3 other fields | High correlation |
| extraction_type_class is highly overall correlated with extraction_type and 3 other fields | High correlation |
| extraction_type_group is highly overall correlated with extraction_type and 3 other fields | High correlation |
| gps_height is highly overall correlated with construction_year and 1 other field | High correlation |
| latitude is highly overall correlated with basin and 1 other field | High correlation |
| longitude is highly overall correlated with basin and 1 other field | High correlation |
| management is highly overall correlated with management_group and 1 other field | High correlation |
| management_group is highly overall correlated with management and 1 other field | High correlation |
| payment is highly overall correlated with payment_type | High correlation |
| payment_type is highly overall correlated with payment | High correlation |
| population is highly overall correlated with construction_year and 1 other field | High correlation |
| quality_group is highly overall correlated with water_quality | High correlation |
| quantity is highly overall correlated with quantity_group | High correlation |
| quantity_group is highly overall correlated with quantity | High correlation |
| region is highly overall correlated with basin and 4 other fields | High correlation |
| region_code is highly overall correlated with region | High correlation |
| scheme_management is highly overall correlated with management and 1 other field | High correlation |
| source is highly overall correlated with source_class and 1 other field | High correlation |
| source_class is highly overall correlated with source and 1 other field | High correlation |
| source_type is highly overall correlated with source and 1 other field | High correlation |
| water_quality is highly overall correlated with quality_group | High correlation |
| waterpoint_type is highly overall correlated with extraction_type and 3 other fields | High correlation |
| waterpoint_type_group is highly overall correlated with extraction_type and 3 other fields | High correlation |
| public_meeting is highly imbalanced (56.3%) | Imbalance |
| management_group is highly imbalanced (69.3%) | Imbalance |
| water_quality is highly imbalanced (71.3%) | Imbalance |
| quality_group is highly imbalanced (68.0%) | Imbalance |
| funder has 3637 (6.1%) missing values | Missing |
| installer has 3655 (6.2%) missing values | Missing |
| public_meeting has 3334 (5.6%) missing values | Missing |
| scheme_management has 3878 (6.5%) missing values | Missing |
| scheme_name has 28810 (48.5%) missing values | Missing |
| permit has 3056 (5.1%) missing values | Missing |
| amount_tsh is highly skewed (γ1 = 57.80779995) | Skewed |
| num_private is highly skewed (γ1 = 91.93374999) | Skewed |
| id is uniformly distributed | Uniform |
| id has unique values | Unique |
| amount_tsh has 41639 (70.1%) zeros | Zeros |
| gps_height has 20438 (34.4%) zeros | Zeros |
| longitude has 1812 (3.1%) zeros | Zeros |
| num_private has 58643 (98.7%) zeros | Zeros |
| population has 21381 (36.0%) zeros | Zeros |
| construction_year has 20709 (34.9%) zeros | Zeros |
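The Imbalance alerts above quote a percentage but not its definition. A minimal sketch, assuming the score is one minus the Shannon entropy of the class counts normalized by its maximum (this is an assumption about how the profiler derives the figure, and the function name and counts below are hypothetical):

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. Assumed definition, not confirmed by the report.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Hypothetical counts, not taken from the report.
print(imbalance_score([500, 500]))  # balanced two-class column
print(imbalance_score([990, 10]))   # one class dominates
```

Under this definition a score such as 71.3% would mean the column carries well under a third of the information a balanced column with the same number of classes could carry.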
Reproduction
| Analysis started | 2025-04-19 01:29:19.159288 |
|---|---|
| Analysis finished | 2025-04-19 01:29:50.827046 |
| Duration | 31.67 seconds |
| Software version | ydata-profiling v4.12.2 |
Variables
id
Real number (ℝ)
Uniform  Unique 
| Distinct | 59400 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37115.132 |
| Minimum | 0 |
| Maximum | 74247 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3730.9 |
| Q1 | 18519.75 |
| median | 37061.5 |
| Q3 | 55656.5 |
| 95-th percentile | 70564.05 |
| Maximum | 74247 |
| Range | 74247 |
| Interquartile range (IQR) | 37136.75 |
Descriptive statistics
| Standard deviation | 21453.128 |
|---|---|
| Coefficient of variation (CV) | 0.57801569 |
| Kurtosis | -1.201515 |
| Mean | 37115.132 |
| Median Absolute Deviation (MAD) | 18568.5 |
| Skewness | 0.0026225303 |
| Sum | 2.2046388 × 10^9 |
| Variance | 4.6023672 × 10^8 |
| Monotonicity | Not monotonic |
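The Uniform alert on id can be sanity-checked against the theoretical moments of a discrete uniform distribution over the reported range. A minimal sketch, assuming id behaves like integers drawn evenly from 0 to 74247 (the report does not state how the alert itself is computed):

```python
import math

low, high = 0, 74_247  # Minimum and Maximum of id in the report
n_values = high - low + 1

# Theoretical moments of a discrete uniform distribution on low..high.
mean = (low + high) / 2
std = math.sqrt((n_values**2 - 1) / 12)
excess_kurtosis = -6 * (n_values**2 + 1) / (5 * (n_values**2 - 1))

print(f"mean = {mean:.1f}, std = {std:.1f}, kurtosis = {excess_kurtosis:.3f}")

# Coefficient of variation from the reported standard deviation and mean.
cv = 21453.128 / 37115.132
print(f"CV = {cv:.4f}")
```

The theoretical values (mean 37123.5, standard deviation about 21433.6, excess kurtosis about -1.2) sit close to the reported 37115.132, 21453.128 and -1.201515, which is consistent with the alert.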
| Value | Count | Frequency (%) |
| 69572 | 1 | < 0.1% |
| 27851 | 1 | < 0.1% |
| 6924 | 1 | < 0.1% |
| 61097 | 1 | < 0.1% |
| 48517 | 1 | < 0.1% |
| 62700 | 1 | < 0.1% |
| 48914 | 1 | < 0.1% |
| 479 | 1 | < 0.1% |
| 12824 | 1 | < 0.1% |
| 21909 | 1 | < 0.1% |
| Other values (59390) | 59390 | > 99.9% |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 74247 | 1 | < 0.1% |
| 74246 | 1 | < 0.1% |
| 74243 | 1 | < 0.1% |
| 74242 | 1 | < 0.1% |
| 74240 | 1 | < 0.1% |
| 74239 | 1 | < 0.1% |
| 74238 | 1 | < 0.1% |
| 74237 | 1 | < 0.1% |
| 74236 | 1 | < 0.1% |
| 74235 | 1 | < 0.1% |
amount_tsh
Real number (ℝ)
Skewed  Zeros 
| Distinct | 98 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 317.65038 |
| Minimum | 0 |
| Maximum | 350000 |
| Zeros | 41639 |
| Zeros (%) | 70.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 20 |
| 95-th percentile | 1200 |
| Maximum | 350000 |
| Range | 350000 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 2997.5746 |
|---|---|
| Coefficient of variation (CV) | 9.43671 |
| Kurtosis | 4903.5431 |
| Mean | 317.65038 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 57.8078 |
| Sum | 18868433 |
| Variance | 8985453.2 |
| Monotonicity | Not monotonic |
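The skewness of 57.8 reported for amount_tsh reflects a column that is mostly zeros with a few very large values. A minimal sketch of the population moment coefficient of skewness on a hypothetical toy sample; the profiler's exact estimator may apply a bias correction, and the log1p transform is shown only as one common way to tame such a tail, not as something the report recommends:

```python
import math

def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2**1.5

# Hypothetical toy sample: mostly zeros plus one very large amount.
toy = [0] * 7 + [20, 500, 350_000]

raw = skewness(toy)
logged = skewness([math.log1p(v) for v in toy])
print(f"raw skewness:   {raw:.2f}")
print(f"log1p skewness: {logged:.2f}")
```

On this toy sample the log-transformed values are noticeably less skewed than the raw ones, which is the usual motivation for transforming such a column before modelling.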
| Value | Count | Frequency (%) |
| 0 | 41639 | 70.1% |
| 500 | 3102 | 5.2% |
| 50 | 2472 | 4.2% |
| 1000 | 1488 | 2.5% |
| 20 | 1463 | 2.5% |
| 200 | 1220 | 2.1% |
| 100 | 816 | 1.4% |
| 10 | 806 | 1.4% |
| 30 | 743 | 1.3% |
| 2000 | 704 | 1.2% |
| Other values (88) | 4947 | 8.3% |
| Value | Count | Frequency (%) |
| 0 | 41639 | 70.1% |
| 0.2 | 3 | < 0.1% |
| 0.25 | 1 | < 0.1% |
| 1 | 3 | < 0.1% |
| 2 | 13 | < 0.1% |
| 5 | 376 | 0.6% |
| 6 | 190 | 0.3% |
| 7 | 69 | 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 806 | 1.4% |
| Value | Count | Frequency (%) |
| 350000 | 1 | < 0.1% |
| 250000 | 1 | < 0.1% |
| 200000 | 1 | < 0.1% |
| 170000 | 1 | < 0.1% |
| 138000 | 1 | < 0.1% |
| 120000 | 1 | < 0.1% |
| 117000 | 7 | < 0.1% |
| 100000 | 3 | < 0.1% |
| 70000 | 1 | < 0.1% |
| 60000 | 1 | < 0.1% |
date_recorded
Date
| Distinct | 356 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
| Minimum | 2002-10-14 00:00:00 |
| Maximum | 2013-12-03 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
funder
Text
Missing 
| Distinct | 1896 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 3637 |
| Missing (%) | 6.1% |
| Memory size | 464.2 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 9.930115 |
| Min length | 1 |
Unique
| Unique | 974 |
|---|---|
| Unique (%) | 1.7% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | Grumeti |
| 3rd row | Lottery Club |
| 4th row | Unicef |
| 5th row | Action In A |
| Value | Count | Frequency (%) |
| of | 9748 | 10.8% |
| government | 9276 | 10.3% |
| tanzania | 9172 | 10.1% |
| danida | 3123 | 3.5% |
| world | 2789 | 3.1% |
| water | 2645 | 2.9% |
| hesawa | 2203 | 2.4% |
| bank | 1416 | 1.6% |
| rwssp | 1376 | 1.5% |
| kkkt | 1370 | 1.5% |
| Other values (2064) | 47252 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 68200 | 12.3% |
| n | 57840 | 10.4% |
| i | 38011 | 6.9% |
| e | 37462 | 6.8% |
| (space) | 34673 | 6.3% |
| r | 27879 | 5.0% |
| t | 23016 | 4.2% |
| o | 22739 | 4.1% |
| s | 17208 | 3.1% |
| d | 15464 | 2.8% |
| Other values (59) | 211241 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 553733 |
gps_height
Real number (ℝ)
High correlation  Zeros 
| Distinct | 2428 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 668.29724 |
| Minimum | -90 |
| Maximum | 2770 |
| Zeros | 20438 |
| Zeros (%) | 34.4% |
| Negative | 1496 |
| Negative (%) | 2.5% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | -90 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 369 |
| Q3 | 1319.25 |
| 95-th percentile | 1797 |
| Maximum | 2770 |
| Range | 2860 |
| Interquartile range (IQR) | 1319.25 |
Descriptive statistics
| Standard deviation | 693.11635 |
|---|---|
| Coefficient of variation (CV) | 1.0371378 |
| Kurtosis | -1.2924401 |
| Mean | 668.29724 |
| Median Absolute Deviation (MAD) | 369 |
| Skewness | 0.46240208 |
| Sum | 39696856 |
| Variance | 480410.28 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 20438 | 34.4% |
| -15 | 60 | 0.1% |
| -16 | 55 | 0.1% |
| -13 | 55 | 0.1% |
| 1290 | 52 | 0.1% |
| -20 | 52 | 0.1% |
| -14 | 51 | 0.1% |
| 303 | 51 | 0.1% |
| -18 | 49 | 0.1% |
| -19 | 47 | 0.1% |
| Other values (2418) | 38490 | 64.8% |
| Value | Count | Frequency (%) |
| -90 | 1 | < 0.1% |
| -63 | 2 | < 0.1% |
| -59 | 1 | < 0.1% |
| -57 | 1 | < 0.1% |
| -55 | 1 | < 0.1% |
| -54 | 1 | < 0.1% |
| -53 | 1 | < 0.1% |
| -52 | 2 | < 0.1% |
| -51 | 2 | < 0.1% |
| -50 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 2770 | 1 | < 0.1% |
| 2628 | 1 | < 0.1% |
| 2627 | 1 | < 0.1% |
| 2626 | 2 | < 0.1% |
| 2623 | 1 | < 0.1% |
| 2614 | 1 | < 0.1% |
| 2585 | 1 | < 0.1% |
| 2576 | 1 | < 0.1% |
| 2569 | 1 | < 0.1% |
| 2568 | 1 | < 0.1% |
installer
Text
Missing 
| Distinct | 2145 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 3655 |
| Missing (%) | 6.2% |
| Memory size | 464.2 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 29 |
| Mean length | 6.1112028 |
| Min length | 1 |
Unique
| Unique | 1098 |
|---|---|
| Unique (%) | 2.0% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | GRUMETI |
| 3rd row | World vision |
| 4th row | UNICEF |
| 5th row | Artisan |
| Value | Count | Frequency (%) |
| dwe | 17601 | 25.8% |
| government | 2778 | 4.1% |
| water | 1881 | 2.8% |
| hesawa | 1395 | 2.0% |
| rwe | 1230 | 1.8% |
| district | 1216 | 1.8% |
| kkkt | 1153 | 1.7% |
| council | 1106 | 1.6% |
| commu | 1065 | 1.6% |
| danida | 1051 | 1.5% |
| Other values (1976) | 37806 |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 27595 | 8.1% |
| W | 25849 | 7.6% |
| E | 25389 | 7.5% |
| a | 17343 | 5.1% |
| n | 16558 | 4.9% |
| e | 15500 | 4.5% |
| i | 15053 | 4.4% |
| A | 13668 | 4.0% |
| r | 13377 | 3.9% |
| t | 12904 | 3.8% |
| Other values (60) | 157433 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 340669 |
longitude
Real number (ℝ)
High correlation  Zeros 
| Distinct | 57516 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.077427 |
| Minimum | 0 |
| Maximum | 40.345193 |
| Zeros | 1812 |
| Zeros (%) | 3.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 30.04066 |
| Q1 | 33.090347 |
| median | 34.908743 |
| Q3 | 37.178387 |
| 95-th percentile | 39.13324 |
| Maximum | 40.345193 |
| Range | 40.345193 |
| Interquartile range (IQR) | 4.0880392 |
Descriptive statistics
| Standard deviation | 6.5674318 |
|---|---|
| Coefficient of variation (CV) | 0.19272089 |
| Kurtosis | 19.187031 |
| Mean | 34.077427 |
| Median Absolute Deviation (MAD) | 2.0325111 |
| Skewness | -4.1910465 |
| Sum | 2024199.1 |
| Variance | 43.131161 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1812 | 3.1% |
| 37.37571687 | 2 | < 0.1% |
| 38.34050134 | 2 | < 0.1% |
| 39.08618257 | 2 | < 0.1% |
| 33.00503158 | 2 | < 0.1% |
| 39.09178536 | 2 | < 0.1% |
| 32.98751118 | 2 | < 0.1% |
| 37.23632569 | 2 | < 0.1% |
| 39.08628657 | 2 | < 0.1% |
| 39.08596496 | 2 | < 0.1% |
| Other values (57506) | 57570 | 96.9% |
| Value | Count | Frequency (%) |
| 0 | 1812 | 3.1% |
| 29.6071219 | 1 | < 0.1% |
| 29.60720109 | 1 | < 0.1% |
| 29.61032056 | 1 | < 0.1% |
| 29.61096482 | 1 | < 0.1% |
| 29.61194674 | 1 | < 0.1% |
| 29.61250689 | 1 | < 0.1% |
| 29.61276296 | 1 | < 0.1% |
| 29.61344309 | 1 | < 0.1% |
| 29.6168718 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 40.34519307 | 1 | < 0.1% |
| 40.34430089 | 1 | < 0.1% |
| 40.32523996 | 1 | < 0.1% |
| 40.32522643 | 1 | < 0.1% |
| 40.32340181 | 1 | < 0.1% |
| 40.32283237 | 1 | < 0.1% |
| 40.32280453 | 1 | < 0.1% |
| 40.3226251 | 1 | < 0.1% |
| 40.32216902 | 1 | < 0.1% |
| 40.32196593 | 1 | < 0.1% |
latitude
Real number (ℝ)
High correlation 
| Distinct | 57517 |
|---|---|
| Distinct (%) | 96.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.7060327 |
| Minimum | -11.64944 |
| Maximum | -2 × 10^-8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 59400 |
| Negative (%) | 100.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | -11.64944 |
|---|---|
| 5-th percentile | -10.58555 |
| Q1 | -8.5406213 |
| median | -5.0215966 |
| Q3 | -3.3261556 |
| 95-th percentile | -1.4088722 |
| Maximum | -2 × 10^-8 |
| Range | 11.64944 |
| Interquartile range (IQR) | 5.2144657 |
Descriptive statistics
| Standard deviation | 2.9460191 |
|---|---|
| Coefficient of variation (CV) | -0.51629902 |
| Kurtosis | -1.0576167 |
| Mean | -5.7060327 |
| Median Absolute Deviation (MAD) | 2.0700299 |
| Skewness | -0.15203657 |
| Sum | -338938.34 |
| Variance | 8.6790284 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -2 × 10^-8 | 1812 | 3.1% |
| -6.98584173 | 2 | < 0.1% |
| -6.9802204 | 2 | < 0.1% |
| -2.47667983 | 2 | < 0.1% |
| -6.97826294 | 2 | < 0.1% |
| -7.07808103 | 2 | < 0.1% |
| -2.46524583 | 2 | < 0.1% |
| -2.4943533 | 2 | < 0.1% |
| -7.1772029 | 2 | < 0.1% |
| -2.51532072 | 2 | < 0.1% |
| Other values (57507) | 57570 | 96.9% |
| Value | Count | Frequency (%) |
| -11.64944018 | 1 | < 0.1% |
| -11.64837759 | 1 | < 0.1% |
| -11.58629656 | 1 | < 0.1% |
| -11.56857679 | 1 | < 0.1% |
| -11.56680457 | 1 | < 0.1% |
| -11.56450865 | 1 | < 0.1% |
| -11.56432357 | 1 | < 0.1% |
| -11.56231592 | 1 | < 0.1% |
| -11.56228898 | 1 | < 0.1% |
| -11.56161898 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -2 × 10^-8 | 1812 | 3.1% |
| -0.99846435 | 1 | < 0.1% |
| -0.998916 | 1 | < 0.1% |
| -0.99901209 | 1 | < 0.1% |
| -0.99911702 | 1 | < 0.1% |
| -0.9994692 | 1 | < 0.1% |
| -0.99950651 | 1 | < 0.1% |
| -0.99952232 | 1 | < 0.1% |
| -1.00058519 | 1 | < 0.1% |
| -1.0015208 | 1 | < 0.1% |
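The longitude and latitude tables share a pattern: longitude has 1812 zeros and latitude has 1812 values of -2 × 10^-8, while the next-smallest longitude is about 29.6. That looks like a placeholder for unrecorded coordinates rather than real locations. A minimal sketch of flagging such pairs as missing, assuming the two sets of 1812 values fall on the same rows (the report does not confirm this; the function name, tolerance and sample rows are hypothetical):

```python
def is_placeholder_coordinate(longitude, latitude, tol=1e-6):
    """Flag a (longitude, latitude) pair that sits at the (0, ~0) placeholder."""
    return abs(longitude) < tol and abs(latitude) < tol

# Hypothetical rows; only the placeholder values mirror the report.
rows = [
    {"id": 1, "longitude": 0.0, "latitude": -2e-08},
    {"id": 2, "longitude": 34.908743, "latitude": -5.0215966},
]

for row in rows:
    if is_placeholder_coordinate(row["longitude"], row["latitude"]):
        # Treat the placeholder as missing rather than as a real position.
        row["longitude"] = row["latitude"] = None

print(rows)
```

Handling these rows as missing would also remove the heavy negative skew (-4.19) and high kurtosis (19.19) that the zeros introduce into the longitude statistics.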
wpt_name
Text
| Distinct | 37399 |
|---|---|
| Distinct (%) | 63.0% |
| Missing | 2 |
| Missing (%) | < 0.1% |
| Memory size | 464.2 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 25 |
| Mean length | 10.962339 |
| Min length | 1 |
Unique
| Unique | 32928 |
|---|---|
| Unique (%) | 55.4% |
Sample
| 1st row | none |
|---|---|
| 2nd row | Zahanati |
| 3rd row | Kwa Mahundi |
| 4th row | Zahanati Ya Nanyumbu |
| 5th row | Shuleni |
| Value | Count | Frequency (%) |
| kwa | 21384 | 19.6% |
| none | 3563 | 3.3% |
| mzee | 3385 | 3.1% |
| shuleni | 2123 | 1.9% |
| ya | 1499 | 1.4% |
| shule | 1389 | 1.3% |
| school | 1113 | 1.0% |
| primary | 1052 | 1.0% |
| zahanati | 983 | 0.9% |
| msingi | 870 | 0.8% |
| Other values (29461) | 71931 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 98806 | 15.2% |
| i | 52404 | 8.0% |
| (space) | 49898 | 7.7% |
| n | 42146 | 6.5% |
| e | 40983 | 6.3% |
| w | 31669 | 4.9% |
| K | 31385 | 4.8% |
| o | 30245 | 4.6% |
| u | 24217 | 3.7% |
| M | 22040 | 3.4% |
| Other values (65) | 227348 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 651141 |
num_private
Real number (ℝ)
Skewed  Zeros 
| Distinct | 65 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.47414141 |
| Minimum | 0 |
| Maximum | 1776 |
| Zeros | 58643 |
| Zeros (%) | 98.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1776 |
| Range | 1776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12.23623 |
|---|---|
| Coefficient of variation (CV) | 25.807131 |
| Kurtosis | 11137.295 |
| Mean | 0.47414141 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 91.93375 |
| Sum | 28164 |
| Variance | 149.72532 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 58643 | 98.7% |
| 6 | 81 | 0.1% |
| 1 | 73 | 0.1% |
| 5 | 46 | 0.1% |
| 8 | 46 | 0.1% |
| 32 | 40 | 0.1% |
| 45 | 36 | 0.1% |
| 15 | 35 | 0.1% |
| 39 | 30 | 0.1% |
| 93 | 28 | < 0.1% |
| Other values (55) | 342 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 58643 | 98.7% |
| 1 | 73 | 0.1% |
| 2 | 23 | < 0.1% |
| 3 | 27 | < 0.1% |
| 4 | 20 | < 0.1% |
| 5 | 46 | 0.1% |
| 6 | 81 | 0.1% |
| 7 | 26 | < 0.1% |
| 8 | 46 | 0.1% |
| 9 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 1776 | 1 | < 0.1% |
| 1402 | 1 | < 0.1% |
| 755 | 1 | < 0.1% |
| 698 | 1 | < 0.1% |
| 672 | 1 | < 0.1% |
| 668 | 1 | < 0.1% |
| 450 | 1 | < 0.1% |
| 300 | 1 | < 0.1% |
| 280 | 1 | < 0.1% |
| 240 | 1 | < 0.1% |
basin
Categorical
High correlation 
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 11 |
| Mean length | 10.892357 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Lake Nyasa |
|---|---|
| 2nd row | Lake Victoria |
| 3rd row | Pangani |
| 4th row | Ruvuma / Southern Coast |
| 5th row | Lake Victoria |
Common Values
| Value | Count | Frequency (%) |
| Lake Victoria | 10248 | 17.3% |
| Pangani | 8940 | 15.1% |
| Rufiji | 7976 | 13.4% |
| Internal | 7785 | 13.1% |
| Lake Tanganyika | 6432 | 10.8% |
| Wami / Ruvu | 5987 | 10.1% |
| Lake Nyasa | 5085 | 8.6% |
| Ruvuma / Southern Coast | 4493 | 7.6% |
| Lake Rukwa | 2454 | 4.1% |
| Value | Count | Frequency (%) |
| lake | 24219 | 22.2% |
| / | 10480 | 9.6% |
| victoria | 10248 | 9.4% |
| pangani | 8940 | 8.2% |
| rufiji | 7976 | 7.3% |
| internal | 7785 | 7.1% |
| tanganyika | 6432 | 5.9% |
| wami | 5987 | 5.5% |
| ruvu | 5987 | 5.5% |
| nyasa | 5085 | 4.7% |
| Other values (4) | 15933 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 107025 | 16.5% |
| i | 57807 | 8.9% |
| n | 50807 | 7.9% |
| (space) | 49672 | 7.7% |
| e | 36497 | 5.6% |
| u | 35883 | 5.5% |
| k | 33105 | 5.1% |
| t | 27019 | 4.2% |
| L | 24219 | 3.7% |
| r | 22526 | 3.5% |
| Other values (22) | 202446 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 647006 |
subvillage
Text
| Distinct | 19287 |
|---|---|
| Distinct (%) | 32.7% |
| Missing | 371 |
| Missing (%) | 0.6% |
| Memory size | 464.2 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 7.8975927 |
| Min length | 1 |
Unique
| Unique | 9424 |
|---|---|
| Unique (%) | 16.0% |
Sample
| 1st row | Mnyusi B |
|---|---|
| 2nd row | Nyamara |
| 3rd row | Majengo |
| 4th row | Mahakamani |
| 5th row | Kyanyamisa |
| Value | Count | Frequency (%) |
| a | 2387 | 3.4% |
| b | 2043 | 2.9% |
| kati | 1902 | 2.7% |
| majengo | 610 | 0.9% |
| wa | 600 | 0.8% |
| shuleni | 593 | 0.8% |
| madukani | 569 | 0.8% |
| mtaa | 514 | 0.7% |
| juu | 403 | 0.6% |
| mjini | 378 | 0.5% |
| Other values (17024) | 60795 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 72003 | 15.4% |
| i | 45666 | 9.8% |
| n | 33499 | 7.2% |
| u | 26424 | 5.7% |
| e | 25671 | 5.5% |
| o | 23556 | 5.1% |
| M | 20431 | 4.4% |
| g | 18951 | 4.1% |
| l | 16372 | 3.5% |
| m | 15053 | 3.2% |
| Other values (63) | 168561 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 466187 |
region
Categorical
High correlation 
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 11 |
| Mean length | 6.6237542 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Iringa |
|---|---|
| 2nd row | Mara |
| 3rd row | Manyara |
| 4th row | Mtwara |
| 5th row | Kagera |
Common Values
| Value | Count | Frequency (%) |
| Iringa | 5294 | 8.9% |
| Shinyanga | 4982 | 8.4% |
| Mbeya | 4639 | 7.8% |
| Kilimanjaro | 4379 | 7.4% |
| Morogoro | 4006 | 6.7% |
| Arusha | 3350 | 5.6% |
| Kagera | 3316 | 5.6% |
| Mwanza | 3102 | 5.2% |
| Kigoma | 2816 | 4.7% |
| Ruvuma | 2640 | 4.4% |
| Other values (11) | 20876 | 35.1% |
| Value | Count | Frequency (%) |
| iringa | 5294 | 8.7% |
| shinyanga | 4982 | 8.2% |
| mbeya | 4639 | 7.6% |
| kilimanjaro | 4379 | 7.2% |
| morogoro | 4006 | 6.6% |
| arusha | 3350 | 5.5% |
| kagera | 3316 | 5.4% |
| mwanza | 3102 | 5.1% |
| kigoma | 2816 | 4.6% |
| ruvuma | 2640 | 4.3% |
| Other values (13) | 22486 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 83413 | 21.2% |
| n | 33143 | 8.4% |
| r | 32397 | 8.2% |
| i | 31763 | 8.1% |
| o | 29580 | 7.5% |
| g | 25054 | 6.4% |
| M | 17029 | 4.3% |
| m | 12841 | 3.3% |
| y | 11204 | 2.8% |
| K | 10511 | 2.7% |
| Other values (22) | 106516 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 393451 |
region_code
Real number (ℝ)
High correlation 
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.297003 |
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 12 |
| Q3 | 17 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 17.587406 |
|---|---|
| Coefficient of variation (CV) | 1.1497289 |
| Kurtosis | 10.288433 |
| Mean | 15.297003 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.1738181 |
| Sum | 908642 |
| Variance | 309.31686 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 5300 | 8.9% |
| 17 | 5011 | 8.4% |
| 12 | 4639 | 7.8% |
| 3 | 4379 | 7.4% |
| 5 | 4040 | 6.8% |
| 18 | 3324 | 5.6% |
| 19 | 3047 | 5.1% |
| 2 | 3024 | 5.1% |
| 16 | 2816 | 4.7% |
| 10 | 2640 | 4.4% |
| Other values (17) | 21180 | 35.7% |
| Value | Count | Frequency (%) |
| 1 | 2201 | 3.7% |
| 2 | 3024 | 5.1% |
| 3 | 4379 | 7.4% |
| 4 | 2513 | 4.2% |
| 5 | 4040 | 6.8% |
| 6 | 1609 | 2.7% |
| 7 | 805 | 1.4% |
| 8 | 300 | 0.5% |
| 9 | 390 | 0.7% |
| 10 | 2640 | 4.4% |
| Value | Count | Frequency (%) |
| 99 | 423 | 0.7% |
| 90 | 917 | 1.5% |
| 80 | 1238 | 2.1% |
| 60 | 1025 | 1.7% |
| 40 | 1 | < 0.1% |
| 24 | 326 | 0.5% |
| 21 | 1583 | 2.7% |
| 20 | 1969 | 3.3% |
| 19 | 3047 | 5.1% |
| 18 | 3324 | 5.6% |
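Although region_code is flagged as highly correlated with region, the two are not interchangeable: region has 21 distinct values while region_code has 27, and the counts differ slightly (Iringa has 5294 rows, the most common code, 11, has 5300). A minimal sketch of listing which regions appear with more than one code, using hypothetical pairs rather than data from the report:

```python
from collections import defaultdict

def codes_per_region(pairs):
    """Map each region name to the sorted list of distinct codes seen with it."""
    seen = defaultdict(set)
    for region, code in pairs:
        seen[region].add(code)
    return {region: sorted(codes) for region, codes in seen.items()}

# Hypothetical (region, region_code) pairs for illustration only.
pairs = [
    ("Region A", 11),
    ("Region A", 11),
    ("Region B", 12),
    ("Region C", 6),
    ("Region C", 60),
]

mapping = codes_per_region(pairs)
ambiguous = {region: codes for region, codes in mapping.items() if len(codes) > 1}
print(ambiguous)
```

Running the same check on the real columns would show where the six extra codes come from before either column is dropped as redundant.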
district_code
Real number (ℝ)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.6297475 |
| Minimum | 0 |
| Maximum | 80 |
| Zeros | 23 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 80 |
| Range | 80 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 9.6336486 |
|---|---|
| Coefficient of variation (CV) | 1.7112044 |
| Kurtosis | 16.214284 |
| Mean | 5.6297475 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.9620453 |
| Sum | 334407 |
| Variance | 92.807186 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 12203 | 20.5% |
| 2 | 11173 | 18.8% |
| 3 | 9998 | 16.8% |
| 4 | 8999 | 15.1% |
| 5 | 4356 | 7.3% |
| 6 | 4074 | 6.9% |
| 7 | 3343 | 5.6% |
| 8 | 1043 | 1.8% |
| 30 | 995 | 1.7% |
| 33 | 874 | 1.5% |
| Other values (10) | 2342 | 3.9% |
| Value | Count | Frequency (%) |
| 0 | 23 | < 0.1% |
| 1 | 12203 | 20.5% |
| 2 | 11173 | 18.8% |
| 3 | 9998 | 16.8% |
| 4 | 8999 | 15.1% |
| 5 | 4356 | 7.3% |
| 6 | 4074 | 6.9% |
| 7 | 3343 | 5.6% |
| 8 | 1043 | 1.8% |
| 13 | 391 | 0.7% |
| Value | Count | Frequency (%) |
| 80 | 12 | < 0.1% |
| 67 | 6 | < 0.1% |
| 63 | 195 | 0.3% |
| 62 | 109 | 0.2% |
| 60 | 63 | 0.1% |
| 53 | 745 | 1.3% |
| 43 | 505 | 0.9% |
| 33 | 874 | 1.5% |
| 30 | 995 | 1.7% |
| 23 | 293 | 0.5% |
lga
Text
| Distinct | 125 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 14 |
| Mean length | 7.4168855 |
| Min length | 3 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Ludewa |
|---|---|
| 2nd row | Serengeti |
| 3rd row | Simanjiro |
| 4th row | Nanyumbu |
| 5th row | Karagwe |
Words
| Value | Count | Frequency (%) |
| rural | 9552 | 13.5% |
| njombe | 2503 | 3.5% |
| urban | 1683 | 2.4% |
| moshi | 1330 | 1.9% |
| arusha | 1315 | 1.9% |
| bariadi | 1177 | 1.7% |
| singida | 1172 | 1.7% |
| rungwe | 1106 | 1.6% |
| kilosa | 1094 | 1.5% |
| kasulu | 1047 | 1.5% |
| Other values (106) | 48656 | 68.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 69982 | 15.9% |
| o | 30079 | 6.8% |
| i | 29483 | 6.7% |
| u | 28324 | 6.4% |
| r | 26886 | 6.1% |
| e | 22579 | 5.1% |
| n | 22521 | 5.1% |
| l | 19238 | 4.4% |
| g | 18385 | 4.2% |
| M | 16017 | 3.6% |
| Other values (31) | 157069 | 35.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 440563 | 100.0% |
ward
Text
| Distinct | 2092 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 19 |
| Mean length | 7.5058418 |
| Min length | 3 |
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Mundindi |
|---|---|
| 2nd row | Natta |
| 3rd row | Ngorika |
| 4th row | Nanyumbu |
| 5th row | Nyakasimbi |
Words
| Value | Count | Frequency (%) |
| mashariki | 580 | 0.9% |
| urban | 540 | 0.8% |
| siha | 434 | 0.7% |
| kusini | 393 | 0.6% |
| magharibi | 362 | 0.6% |
| igosi | 307 | 0.5% |
| masama | 303 | 0.5% |
| machame | 293 | 0.5% |
| kati | 270 | 0.4% |
| imalinyi | 252 | 0.4% |
| Other values (2106) | 61033 | 94.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 69533 | 15.6% |
| i | 40243 | 9.0% |
| n | 29584 | 6.6% |
| u | 27015 | 6.1% |
| o | 26093 | 5.9% |
| e | 23589 | 5.3% |
| g | 21166 | 4.7% |
| M | 18916 | 4.2% |
| m | 16216 | 3.6% |
| l | 15799 | 3.5% |
| Other values (44) | 157693 | 35.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445847 | 100.0% |
population
Real number (ℝ)
High correlation  Zeros 
| Distinct | 1049 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 179.90998 |
| Minimum | 0 |
| Maximum | 30500 |
| Zeros | 21381 |
| Zeros (%) | 36.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 25 |
| Q3 | 215 |
| 95-th percentile | 680 |
| Maximum | 30500 |
| Range | 30500 |
| Interquartile range (IQR) | 215 |
Descriptive statistics
| Standard deviation | 471.48218 |
|---|---|
| Coefficient of variation (CV) | 2.620656 |
| Kurtosis | 402.28012 |
| Mean | 179.90998 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 12.660714 |
| Sum | 10686653 |
| Variance | 222295.44 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 21381 | 36.0% |
| 1 | 7025 | 11.8% |
| 200 | 1940 | 3.3% |
| 150 | 1892 | 3.2% |
| 250 | 1681 | 2.8% |
| 300 | 1476 | 2.5% |
| 100 | 1146 | 1.9% |
| 50 | 1139 | 1.9% |
| 500 | 1009 | 1.7% |
| 350 | 986 | 1.7% |
| Other values (1039) | 19725 | 33.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 21381 | 36.0% |
| 1 | 7025 | 11.8% |
| 2 | 4 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 13 | < 0.1% |
| 5 | 44 | 0.1% |
| 6 | 19 | < 0.1% |
| 7 | 3 | < 0.1% |
| 8 | 23 | < 0.1% |
| 9 | 11 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 30500 | 1 | < 0.1% |
| 15300 | 1 | < 0.1% |
| 11463 | 1 | < 0.1% |
| 10000 | 3 | < 0.1% |
| 9865 | 1 | < 0.1% |
| 9500 | 1 | < 0.1% |
| 9000 | 3 | < 0.1% |
| 8848 | 1 | < 0.1% |
| 8600 | 1 | < 0.1% |
| 8500 | 1 | < 0.1% |
public_meeting
Boolean
Imbalance  Missing 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3334 |
| Missing (%) | 5.6% |
| Memory size | 464.2 KiB |
Common Values
| Value | Count | Frequency (%) |
| True | 51011 | 85.9% |
| False | 5055 | 8.5% |
| (Missing) | 3334 | 5.6% |
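The Frequency (%) column in these tables appears to be each count divided by the total number of rows, shown to one decimal place. The sketch below, which is illustrative and not part of the generated report, recomputes the column for public_meeting from the counts shown above; the one-decimal formatting is an assumption inferred from the displayed values.

```python
# Counts copied from the public_meeting table in this report.
counts = {"True": 51011, "False": 5055, "(Missing)": 3334}
n_rows = sum(counts.values())  # total number of observations

# Frequency (%) = count / total rows, formatted with one decimal place.
frequency = {
    value: f"{100 * count / n_rows:.1f}%"
    for value, count in counts.items()
}

for value, pct in frequency.items():
    print(value, pct)
```

The same calculation applies to any Value/Count table in the report, as long as the denominator is the total the table is counting over (rows, words, or characters).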
recorded_by
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
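A column with a single distinct value, as flagged here by the Constant alert, carries no information for modelling and is usually dropped. The following is a minimal sketch of how such columns could be detected; the helper name `constant_columns` and the toy rows are hypothetical, and only the recorded_by value itself comes from the report.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value."""
    distinct = {}
    for row in rows:
        for name, value in row.items():
            distinct.setdefault(name, set()).add(value)
    return sorted(name for name, values in distinct.items() if len(values) == 1)


# Hypothetical toy rows; only the recorded_by value is taken from the report.
rows = [
    {"recorded_by": "GeoData Consultants Ltd", "permit": True},
    {"recorded_by": "GeoData Consultants Ltd", "permit": False},
]

to_drop = constant_columns(rows)
# Keep every column that is not constant.
cleaned = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
print(to_drop)
```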
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | GeoData Consultants Ltd |
|---|---|
| 2nd row | GeoData Consultants Ltd |
| 3rd row | GeoData Consultants Ltd |
| 4th row | GeoData Consultants Ltd |
| 5th row | GeoData Consultants Ltd |
Common Values
| Value | Count | Frequency (%) |
| GeoData Consultants Ltd | 59400 | 100.0% |
Words
| Value | Count | Frequency (%) |
| geodata | 59400 | 33.3% |
| consultants | 59400 | 33.3% |
| ltd | 59400 | 33.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 237600 | 17.4% |
| a | 178200 | 13.0% |
| o | 118800 | 8.7% |
| (space) | 118800 | 8.7% |
| n | 118800 | 8.7% |
| s | 118800 | 8.7% |
| G | 59400 | 4.3% |
| e | 59400 | 4.3% |
| D | 59400 | 4.3% |
| C | 59400 | 4.3% |
| Other values (4) | 237600 | 17.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1366200 | 100.0% |
scheme_management
Categorical
High correlation  Missing 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3878 |
| Missing (%) | 6.5% |
| Memory size | 464.2 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.6447354 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | VWC |
|---|---|
| 2nd row | Other |
| 3rd row | VWC |
| 4th row | VWC |
| 5th row | VWC |
Common Values
| Value | Count | Frequency (%) |
| VWC | 36793 | 61.9% |
| WUG | 5206 | 8.8% |
| Water authority | 3153 | 5.3% |
| WUA | 2883 | 4.9% |
| Water Board | 2748 | 4.6% |
| Parastatal | 1680 | 2.8% |
| Private operator | 1063 | 1.8% |
| Company | 1061 | 1.8% |
| Other | 766 | 1.3% |
| SWC | 97 | 0.2% |
| (Missing) | 3878 | 6.5% |
Words
| Value | Count | Frequency (%) |
| vwc | 36793 | 58.9% |
| water | 5901 | 9.4% |
| wug | 5206 | 8.3% |
| authority | 3153 | 5.0% |
| wua | 2883 | 4.6% |
| board | 2748 | 4.4% |
| parastatal | 1680 | 2.7% |
| private | 1063 | 1.7% |
| operator | 1063 | 1.7% |
| company | 1061 | 1.7% |
| Other values (3) | 935 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| W | 50880 | 19.7% |
| C | 37951 | 14.7% |
| V | 36793 | 14.3% |
| a | 21709 | 8.4% |
| t | 18531 | 7.2% |
| r | 17509 | 6.8% |
| o | 9088 | 3.5% |
| e | 8793 | 3.4% |
| U | 8089 | 3.1% |
| (space) | 6964 | 2.7% |
| Other values (18) | 41578 | 16.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 257885 | 100.0% |
scheme_name
Text
Missing 
| Distinct | 2695 |
|---|---|
| Distinct (%) | 8.8% |
| Missing | 28810 |
| Missing (%) | 48.5% |
| Memory size | 464.2 KiB |
Length
| Max length | 46 |
|---|---|
| Median length | 37 |
| Mean length | 14.522164 |
| Min length | 1 |
Unique
| Unique | 712 |
|---|---|
| Unique (%) | 2.3% |
Sample
| 1st row | Roman |
|---|---|
| 2nd row | Nyumba ya mungu pipe scheme |
| 3rd row | Zingibali |
| 4th row | BL Bondeni |
| 5th row | wanging'ombe water supply s |
Words
| Value | Count | Frequency (%) |
| water | 9770 | 13.7% |
| supply | 6745 | 9.5% |
| scheme | 2532 | 3.5% |
| wa | 2157 | 3.0% |
| gravity | 1914 | 2.7% |
| pipe | 1346 | 1.9% |
| maji | 1343 | 1.9% |
| mradi | 1097 | 1.5% |
| line | 1016 | 1.4% |
| supplied | 877 | 1.2% |
| Other values (2506) | 42575 | 59.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 48584 | 10.9% |
| (space) | 41252 | 9.3% |
| e | 34595 | 7.8% |
| i | 26411 | 5.9% |
| p | 22451 | 5.1% |
| r | 21816 | 4.9% |
| t | 19216 | 4.3% |
| u | 18441 | 4.2% |
| l | 17308 | 3.9% |
| n | 17116 | 3.9% |
| Other values (58) | 177043 | 39.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 444233 | 100.0% |
permit
Boolean
Missing 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3056 |
| Missing (%) | 5.1% |
| Memory size | 464.2 KiB |
Common Values
| Value | Count | Frequency (%) |
| True | 38852 | 65.4% |
| False | 17492 | 29.4% |
| (Missing) | 3056 | 5.1% |
construction_year
Real number (ℝ)
High correlation  Zeros 
| Distinct | 55 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1300.6525 |
| Minimum | 0 |
| Maximum | 2013 |
| Zeros | 20709 |
| Zeros (%) | 34.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 464.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1986 |
| Q3 | 2004 |
| 95-th percentile | 2010 |
| Maximum | 2013 |
| Range | 2013 |
| Interquartile range (IQR) | 2004 |
Descriptive statistics
| Standard deviation | 951.62055 |
|---|---|
| Coefficient of variation (CV) | 0.73164859 |
| Kurtosis | -1.5964324 |
| Mean | 1300.6525 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | -0.63492779 |
| Sum | 77258757 |
| Variance | 905581.67 |
| Monotonicity | Not monotonic |
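With 34.9% of values equal to 0 and no other year listed before 1960, the mean of 1300.6525 mixes real years with what looks like a placeholder. The sketch below is an illustration rather than part of the generated report. It assumes that 0 means "year not recorded", which the report itself does not state, and uses only the Sum, Zeros, and observation count shown in this report to estimate the mean over recorded years.

```python
# Figures copied from the construction_year tables in this report.
n_rows = 59400     # number of observations (report overview)
zeros = 20709      # Zeros
total = 77258757   # Sum

# Assumption: a 0 means "year not recorded". Zeros add nothing to the sum,
# so excluding them only changes the denominator.
recorded = n_rows - zeros
mean_all = total / n_rows
mean_recorded = total / recorded

print(recorded, round(mean_all, 4), round(mean_recorded, 2))
```

Under that assumption, the same zero-inflated figures (standard deviation, quartiles, IQR) would also need to be recomputed on the non-zero rows before being interpreted.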
Common Values
| Value | Count | Frequency (%) |
| 0 | 20709 | 34.9% |
| 2010 | 2645 | 4.5% |
| 2008 | 2613 | 4.4% |
| 2009 | 2533 | 4.3% |
| 2000 | 2091 | 3.5% |
| 2007 | 1587 | 2.7% |
| 2006 | 1471 | 2.5% |
| 2003 | 1286 | 2.2% |
| 2011 | 1256 | 2.1% |
| 2004 | 1123 | 1.9% |
| Other values (45) | 22086 | 37.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 20709 | 34.9% |
| 1960 | 102 | 0.2% |
| 1961 | 21 | < 0.1% |
| 1962 | 30 | 0.1% |
| 1963 | 85 | 0.1% |
| 1964 | 40 | 0.1% |
| 1965 | 19 | < 0.1% |
| 1966 | 17 | < 0.1% |
| 1967 | 88 | 0.1% |
| 1968 | 77 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 2013 | 176 | 0.3% |
| 2012 | 1084 | 1.8% |
| 2011 | 1256 | 2.1% |
| 2010 | 2645 | 4.5% |
| 2009 | 2533 | 4.3% |
| 2008 | 2613 | 4.4% |
| 2007 | 1587 | 2.7% |
| 2006 | 1471 | 2.5% |
| 2005 | 1011 | 1.7% |
| 2004 | 1123 | 1.9% |
extraction_type
Categorical
High correlation 
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 25 |
|---|---|
| Median length | 17 |
| Mean length | 7.7195118 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
Common Values
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| nira/tanira | 8154 | 13.7% |
| other | 6430 | 10.8% |
| submersible | 4764 | 8.0% |
| swn 80 | 3670 | 6.2% |
| mono | 2865 | 4.8% |
| india mark ii | 2400 | 4.0% |
| afridev | 1770 | 3.0% |
| ksb | 1415 | 2.4% |
| other - rope pump | 451 | 0.8% |
| Other values (8) | 701 | 1.2% |
Words
| Value | Count | Frequency (%) |
| gravity | 26780 | 38.1% |
| nira/tanira | 8154 | 11.6% |
| other | 7197 | 10.2% |
| submersible | 4764 | 6.8% |
| swn | 3899 | 5.5% |
| 80 | 3670 | 5.2% |
| mono | 2865 | 4.1% |
| india | 2498 | 3.6% |
| mark | 2498 | 3.6% |
| ii | 2400 | 3.4% |
| Other values (13) | 5640 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 60078 | 13.1% |
| r | 59768 | 13.0% |
| a | 58179 | 12.7% |
| t | 42131 | 9.2% |
| v | 28550 | 6.2% |
| y | 26867 | 5.9% |
| g | 26782 | 5.8% |
| n | 25691 | 5.6% |
| e | 19036 | 4.2% |
| s | 14844 | 3.2% |
| Other values (19) | 96613 | 21.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 458539 | 100.0% |
extraction_type_group
Categorical
High correlation 
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 14 |
| Mean length | 7.8805387 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
Common Values
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| nira/tanira | 8154 | 13.7% |
| other | 6430 | 10.8% |
| submersible | 6179 | 10.4% |
| swn 80 | 3670 | 6.2% |
| mono | 2865 | 4.8% |
| india mark ii | 2400 | 4.0% |
| afridev | 1770 | 3.0% |
| rope pump | 451 | 0.8% |
| other handpump | 364 | 0.6% |
| Other values (3) | 337 | 0.6% |
Words
| Value | Count | Frequency (%) |
| gravity | 26780 | 38.8% |
| nira/tanira | 8154 | 11.8% |
| other | 6916 | 10.0% |
| submersible | 6179 | 9.0% |
| swn | 3670 | 5.3% |
| 80 | 3670 | 5.3% |
| mono | 2865 | 4.2% |
| mark | 2498 | 3.6% |
| india | 2498 | 3.6% |
| ii | 2400 | 3.5% |
| Other values (7) | 3373 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 61244 | 13.1% |
| r | 61141 | 13.1% |
| a | 58372 | 12.5% |
| t | 41972 | 9.0% |
| v | 28550 | 6.1% |
| g | 26780 | 5.7% |
| y | 26780 | 5.7% |
| n | 25822 | 5.5% |
| e | 21729 | 4.6% |
| s | 16028 | 3.4% |
| Other values (16) | 99686 | 21.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 468104 | 100.0% |
extraction_type_class
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 11 |
| Mean length | 7.6022391 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | gravity |
| 3rd row | gravity |
| 4th row | submersible |
| 5th row | gravity |
Common Values
| Value | Count | Frequency (%) |
| gravity | 26780 | 45.1% |
| handpump | 16456 | 27.7% |
| other | 6430 | 10.8% |
| submersible | 6179 | 10.4% |
| motorpump | 2987 | 5.0% |
| rope pump | 451 | 0.8% |
| wind-powered | 117 | 0.2% |
Words
| Value | Count | Frequency (%) |
| gravity | 26780 | 44.7% |
| handpump | 16456 | 27.5% |
| other | 6430 | 10.7% |
| submersible | 6179 | 10.3% |
| motorpump | 2987 | 5.0% |
| rope | 451 | 0.8% |
| pump | 451 | 0.8% |
| wind-powered | 117 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 43236 | 9.6% |
| r | 42944 | 9.5% |
| p | 40356 | 8.9% |
| t | 36197 | 8.0% |
| i | 33076 | 7.3% |
| m | 29060 | 6.4% |
| g | 26780 | 5.9% |
| y | 26780 | 5.9% |
| v | 26780 | 5.9% |
| u | 26073 | 5.8% |
| Other values (11) | 120291 | 26.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 451573 | 100.0% |
management
Categorical
High correlation 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.3506397 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | vwc |
|---|---|
| 2nd row | wug |
| 3rd row | vwc |
| 4th row | vwc |
| 5th row | other |
Common Values
| Value | Count | Frequency (%) |
| vwc | 40507 | 68.2% |
| wug | 6515 | 11.0% |
| water board | 2933 | 4.9% |
| wua | 2535 | 4.3% |
| private operator | 1971 | 3.3% |
| parastatal | 1768 | 3.0% |
| water authority | 904 | 1.5% |
| other | 844 | 1.4% |
| company | 685 | 1.2% |
| unknown | 561 | 0.9% |
| Other values (2) | 177 | 0.3% |
Words
| Value | Count | Frequency (%) |
| vwc | 40507 | 61.9% |
| wug | 6515 | 10.0% |
| water | 3837 | 5.9% |
| board | 2933 | 4.5% |
| wua | 2535 | 3.9% |
| private | 1971 | 3.0% |
| operator | 1971 | 3.0% |
| parastatal | 1768 | 2.7% |
| other | 943 | 1.4% |
| authority | 904 | 1.4% |
| Other values (5) | 1522 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| w | 53955 | 20.9% |
| v | 42478 | 16.4% |
| c | 41291 | 16.0% |
| a | 21908 | 8.5% |
| r | 16376 | 6.3% |
| t | 14222 | 5.5% |
| u | 10593 | 4.1% |
| o | 10166 | 3.9% |
| e | 8722 | 3.4% |
| g | 6515 | 2.5% |
| Other values (13) | 32202 | 12.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 258428 | 100.0% |
management_group
Categorical
High correlation  Imbalance 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.8922896 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | user-group |
|---|---|
| 2nd row | user-group |
| 3rd row | user-group |
| 4th row | user-group |
| 5th row | other |
Common Values
| Value | Count | Frequency (%) |
| user-group | 52490 | 88.4% |
| commercial | 3638 | 6.1% |
| parastatal | 1768 | 3.0% |
| other | 943 | 1.6% |
| unknown | 561 | 0.9% |
Words
| Value | Count | Frequency (%) |
| user-group | 52490 | 88.4% |
| commercial | 3638 | 6.1% |
| parastatal | 1768 | 3.0% |
| other | 943 | 1.6% |
| unknown | 561 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 111329 | 18.9% |
| u | 105541 | 18.0% |
| o | 57632 | 9.8% |
| e | 57071 | 9.7% |
| s | 54258 | 9.2% |
| p | 54258 | 9.2% |
| - | 52490 | 8.9% |
| g | 52490 | 8.9% |
| a | 10710 | 1.8% |
| m | 7276 | 1.2% |
| Other values (8) | 24547 | 4.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 587602 | 100.0% |
payment
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 21 |
|---|---|
| Median length | 14 |
| Mean length | 10.664798 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | pay annually |
|---|---|
| 2nd row | never pay |
| 3rd row | pay per bucket |
| 4th row | never pay |
| 5th row | never pay |
Common Values
| Value | Count | Frequency (%) |
| never pay | 25348 | 42.7% |
| pay per bucket | 8985 | 15.1% |
| pay monthly | 8300 | 14.0% |
| unknown | 8157 | 13.7% |
| pay when scheme fails | 3914 | 6.6% |
| pay annually | 3642 | 6.1% |
| other | 1054 | 1.8% |
Words
| Value | Count | Frequency (%) |
| pay | 50189 | 39.7% |
| never | 25348 | 20.1% |
| per | 8985 | 7.1% |
| bucket | 8985 | 7.1% |
| monthly | 8300 | 6.6% |
| unknown | 8157 | 6.5% |
| when | 3914 | 3.1% |
| scheme | 3914 | 3.1% |
| fails | 3914 | 3.1% |
| annually | 3642 | 2.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 81462 | 12.9% |
| n | 69317 | 10.9% |
| (space) | 67002 | 10.6% |
| y | 62131 | 9.8% |
| a | 61387 | 9.7% |
| p | 59174 | 9.3% |
| r | 35387 | 5.6% |
| v | 25348 | 4.0% |
| u | 20784 | 3.3% |
| l | 19498 | 3.1% |
| Other values (11) | 131999 | 20.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 633489 | 100.0% |
payment_type
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.5307576 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | annually |
|---|---|
| 2nd row | never pay |
| 3rd row | per bucket |
| 4th row | never pay |
| 5th row | never pay |
Common Values
| Value | Count | Frequency (%) |
| never pay | 25348 | 42.7% |
| per bucket | 8985 | 15.1% |
| monthly | 8300 | 14.0% |
| unknown | 8157 | 13.7% |
| on failure | 3914 | 6.6% |
| annually | 3642 | 6.1% |
| other | 1054 | 1.8% |
Words
| Value | Count | Frequency (%) |
| never | 25348 | 26.0% |
| pay | 25348 | 26.0% |
| per | 8985 | 9.2% |
| bucket | 8985 | 9.2% |
| monthly | 8300 | 8.5% |
| unknown | 8157 | 8.4% |
| on | 3914 | 4.0% |
| failure | 3914 | 4.0% |
| annually | 3642 | 3.7% |
| other | 1054 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 73634 | 14.5% |
| n | 69317 | 13.7% |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 | 21.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 506727 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 73634 | |
| n | 69317 | |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 506727 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 73634 | |
| n | 69317 | |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 506727 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 73634 | |
| n | 69317 | |
| r | 39301 | 7.8% |
| (space) | 38247 | 7.5% |
| y | 37290 | 7.4% |
| a | 36546 | 7.2% |
| p | 34333 | 6.8% |
| v | 25348 | 5.0% |
| u | 24698 | 4.9% |
| o | 21425 | 4.2% |
| Other values (10) | 106588 |
water_quality
Categorical
High correlation  Imbalance 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 4 |
| Mean length | 4.3032828 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | soft |
|---|---|
| 2nd row | soft |
| 3rd row | soft |
| 4th row | soft |
| 5th row | soft |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| soft | 50818 | 85.6% |
| salty | 4856 | 8.2% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| coloured | 490 | 0.8% |
| salty abandoned | 339 | 0.6% |
| fluoride | 200 | 0.3% |
| fluoride abandoned | 17 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| soft | 50818 | |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.1% |
| milky | 804 | 1.3% |
| coloured | 490 | 0.8% |
| abandoned | 356 | 0.6% |
| fluoride | 217 | 0.4% |
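The word-level counts above differ from the category counts because multi-word labels such as "salty abandoned" contribute to more than one word. A minimal sketch of that aggregation, assuming labels are split on whitespace and percentages are taken over the total number of words (names here are illustrative):

```python
from collections import Counter

# Category counts copied from the water_quality "Common Values" table.
category_counts = {
    "soft": 50818,
    "salty": 4856,
    "unknown": 1876,
    "milky": 804,
    "coloured": 490,
    "salty abandoned": 339,
    "fluoride": 200,
    "fluoride abandoned": 17,
}

# Each row contributes its count to every word in its label.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:10s} {count:6d} {100 * count / total_words:5.1f}%")
```

With these inputs, "salty" combines 4856 and 339 rows, and "abandoned" combines 339 and 17 rows, which agrees with the table.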
Most occurring characters
| Value | Count | Frequency (%) |
| s | 56013 | |
| t | 56013 | |
| o | 54247 | |
| f | 51035 | |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 255615 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| s | 56013 | |
| t | 56013 | |
| o | 54247 | |
| f | 51035 | |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 255615 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| s | 56013 | |
| t | 56013 | |
| o | 54247 | |
| f | 51035 | |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 255615 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| s | 56013 | |
| t | 56013 | |
| o | 54247 | |
| f | 51035 | |
| l | 6706 | 2.6% |
| n | 6340 | 2.5% |
| y | 5999 | 2.3% |
| a | 5907 | 2.3% |
| k | 2680 | 1.0% |
| u | 2583 | 1.0% |
| Other values (9) | 8092 | 3.2% |
quality_group
Categorical
High correlation  Imbalance 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.235101 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | good |
|---|---|
| 2nd row | good |
| 3rd row | good |
| 4th row | good |
| 5th row | good |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| good | 50818 | 85.6% |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| colored | 490 | 0.8% |
| fluoride | 217 | 0.4% |
Words
| Value | Count | Frequency (%) |
| good | 50818 | |
| salty | 5195 | 8.7% |
| unknown | 1876 | 3.2% |
| milky | 804 | 1.4% |
| colored | 490 | 0.8% |
| fluoride | 217 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 104709 | |
| d | 51525 | |
| g | 50818 | |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| t | 5195 | 2.1% |
| a | 5195 | 2.1% |
| s | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 251565 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| o | 104709 | |
| d | 51525 | |
| g | 50818 | |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| t | 5195 | 2.1% |
| a | 5195 | 2.1% |
| s | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 251565 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| o | 104709 | |
| d | 51525 | |
| g | 50818 | |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| t | 5195 | 2.1% |
| a | 5195 | 2.1% |
| s | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 251565 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| o | 104709 | |
| d | 51525 | |
| g | 50818 | |
| l | 6706 | 2.7% |
| y | 5999 | 2.4% |
| n | 5628 | 2.2% |
| t | 5195 | 2.1% |
| a | 5195 | 2.1% |
| s | 5195 | 2.1% |
| k | 2680 | 1.1% |
| Other values (8) | 7915 | 3.1% |
quantity
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.3623737 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | enough |
|---|---|
| 2nd row | insufficient |
| 3rd row | enough |
| 4th row | dry |
| 5th row | seasonal |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Words
| Value | Count | Frequency (%) |
| enough | 33186 | |
| insufficient | 15129 | |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
quantity_group
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.3623737 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | enough |
|---|---|
| 2nd row | insufficient |
| 3rd row | enough |
| 4th row | dry |
| 5th row | seasonal |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| enough | 33186 | 55.9% |
| insufficient | 15129 | 25.5% |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Words
| Value | Count | Frequency (%) |
| enough | 33186 | |
| insufficient | 15129 | |
| dry | 6246 | 10.5% |
| seasonal | 4050 | 6.8% |
| unknown | 789 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 437325 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| n | 69861 | |
| e | 52365 | |
| u | 49104 | |
| i | 45387 | |
| o | 38025 | |
| g | 33186 | |
| h | 33186 | |
| f | 30258 | |
| s | 23229 | 5.3% |
| t | 15129 | 3.5% |
| Other values (8) | 47595 |
source
Categorical
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 12 |
| Mean length | 8.9788047 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | rainwater harvesting |
| 3rd row | dam |
| 4th row | machine dbh |
| 5th row | rainwater harvesting |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| machine dbh | 11075 | 18.6% |
| river | 9612 | 16.2% |
| rainwater harvesting | 2295 | 3.9% |
| hand dtw | 874 | 1.5% |
| lake | 765 | 1.3% |
| dam | 656 | 1.1% |
| other | 212 | 0.4% |
| unknown | 66 | 0.1% |
Words
| Value | Count | Frequency (%) |
| spring | 17021 | |
| shallow | 16824 | |
| well | 16824 | |
| machine | 11075 | |
| dbh | 11075 | |
| river | 9612 | |
| rainwater | 2295 | 2.5% |
| harvesting | 2295 | 2.5% |
| hand | 874 | 1.0% |
| dtw | 874 | 1.0% |
| Other values (4) | 1699 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 68061 | |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 533341 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| l | 68061 | |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 533341 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| l | 68061 | |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 533341 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| l | 68061 | |
| r | 43342 | 8.1% |
| e | 43078 | 8.1% |
| h | 42355 | 7.9% |
| i | 42298 | 7.9% |
| a | 37079 | 7.0% |
| w | 36883 | 6.9% |
| s | 36140 | 6.8% |
| n | 33758 | 6.3% |
| (space) | 31068 | 5.8% |
| Other values (11) | 119279 |
source_type
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 12 |
| Mean length | 9.3036027 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | rainwater harvesting |
| 3rd row | dam |
| 4th row | borehole |
| 5th row | rainwater harvesting |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| spring | 17021 | 28.7% |
| shallow well | 16824 | 28.3% |
| borehole | 11949 | 20.1% |
| river/lake | 10377 | 17.5% |
| rainwater harvesting | 2295 | 3.9% |
| dam | 656 | 1.1% |
| other | 278 | 0.5% |
Words
| Value | Count | Frequency (%) |
| spring | 17021 | |
| shallow | 16824 | |
| well | 16824 | |
| borehole | 11949 | |
| river/lake | 10377 | |
| rainwater | 2295 | 2.9% |
| harvesting | 2295 | 2.9% |
| dam | 656 | 0.8% |
| other | 278 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 89622 | |
| e | 66344 | |
| r | 56887 | |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 552634 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| l | 89622 | |
| e | 66344 | |
| r | 56887 | |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 552634 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| l | 89622 | |
| e | 66344 | |
| r | 56887 | |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 552634 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| l | 89622 | |
| e | 66344 | |
| r | 56887 | |
| o | 41000 | 7.4% |
| s | 36140 | 6.5% |
| w | 35943 | 6.5% |
| a | 34742 | 6.3% |
| i | 31988 | 5.8% |
| h | 31346 | 5.7% |
| n | 21611 | 3.9% |
| Other values (10) | 107011 |
source_class
Categorical
High correlation 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.083771 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | groundwater |
|---|---|
| 2nd row | surface |
| 3rd row | surface |
| 4th row | groundwater |
| 5th row | surface |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| groundwater | 45794 | 77.1% |
| surface | 13328 | 22.4% |
| unknown | 278 | 0.5% |
Words
| Value | Count | Frequency (%) |
| groundwater | 45794 | |
| surface | 13328 | 22.4% |
| unknown | 278 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 104916 | |
| u | 59400 | |
| a | 59122 | |
| e | 59122 | |
| n | 46628 | |
| o | 46072 | |
| w | 46072 | |
| g | 45794 | |
| d | 45794 | |
| t | 45794 | |
| Other values (4) | 40262 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 598976 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| r | 104916 | |
| u | 59400 | |
| a | 59122 | |
| e | 59122 | |
| n | 46628 | |
| o | 46072 | |
| w | 46072 | |
| g | 45794 | |
| d | 45794 | |
| t | 45794 | |
| Other values (4) | 40262 | 6.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 598976 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| r | 104916 | |
| u | 59400 | |
| a | 59122 | |
| e | 59122 | |
| n | 46628 | |
| o | 46072 | |
| w | 46072 | |
| g | 45794 | |
| d | 45794 | |
| t | 45794 | |
| Other values (4) | 40262 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 598976 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| r | 104916 | |
| u | 59400 | |
| a | 59122 | |
| e | 59122 | |
| n | 46628 | |
| o | 46072 | |
| w | 46072 | |
| g | 45794 | |
| d | 45794 | |
| t | 45794 | |
| Other values (4) | 40262 | 6.7% |
waterpoint_type
Categorical
High correlation 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.827576 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | communal standpipe |
| 3rd row | communal standpipe multiple |
| 4th row | communal standpipe multiple |
| 5th row | communal standpipe |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| communal standpipe | 28522 | 48.0% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| communal standpipe multiple | 6103 | 10.3% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| communal | 34625 | |
| standpipe | 34625 | |
| hand | 17488 | |
| pump | 17488 | |
| other | 6380 | 5.4% |
| multiple | 6103 | 5.1% |
| improved | 784 | 0.7% |
| spring | 784 | 0.7% |
| cattle | 116 | 0.1% |
| trough | 116 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 111897 | |
| m | 93632 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 880758 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| p | 111897 | |
| m | 93632 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 880758 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| p | 111897 | |
| m | 93632 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 880758 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| p | 111897 | |
| m | 93632 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 59116 | 6.7% |
| u | 58332 | 6.6% |
| d | 52904 | 6.0% |
| e | 48008 | 5.5% |
| t | 47456 | 5.4% |
| l | 46947 | 5.3% |
| Other values (8) | 188083 |
waterpoint_type_group
Categorical
High correlation 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 464.2 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 13.902879 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | communal standpipe |
| 3rd row | communal standpipe |
| 4th row | communal standpipe |
| 5th row | communal standpipe |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| communal standpipe | 34625 | 58.3% |
| hand pump | 17488 | 29.4% |
| other | 6380 | 10.7% |
| improved spring | 784 | 1.3% |
| cattle trough | 116 | 0.2% |
| dam | 7 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| communal | 34625 | |
| standpipe | 34625 | |
| hand | 17488 | |
| pump | 17488 | |
| other | 6380 | 5.7% |
| improved | 784 | 0.7% |
| spring | 784 | 0.7% |
| cattle | 116 | 0.1% |
| trough | 116 | 0.1% |
| dam | 7 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 105794 | |
| m | 87529 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| e | 41905 | 5.1% |
| o | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 825831 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| p | 105794 | |
| m | 87529 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| e | 41905 | 5.1% |
| o | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 825831 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| p | 105794 | |
| m | 87529 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| e | 41905 | 5.1% |
| o | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 825831 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| p | 105794 | |
| m | 87529 | |
| n | 87522 | |
| a | 86861 | |
| (space) | 53013 | 6.4% |
| d | 52904 | 6.4% |
| u | 52229 | 6.3% |
| e | 41905 | 5.1% |
| o | 41905 | 5.1% |
| t | 41353 | 5.0% |
| Other values (8) | 174816 |
Interactions
Correlations
| | amount_tsh | basin | construction_year | district_code | extraction_type | extraction_type_class | extraction_type_group | gps_height | id | latitude | longitude | management | management_group | num_private | payment | payment_type | permit | population | public_meeting | quality_group | quantity | quantity_group | region | region_code | scheme_management | source | source_class | source_type | water_quality | waterpoint_type | waterpoint_type_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| amount_tsh | 1.000 | 0.010 | 0.408 | -0.089 | 0.014 | 0.012 | 0.007 | 0.342 | -0.005 | -0.262 | 0.209 | 0.024 | 0.026 | 0.032 | 0.016 | 0.016 | 0.000 | 0.343 | 0.013 | 0.000 | 0.000 | 0.000 | 0.014 | -0.150 | 0.019 | 0.007 | 0.017 | 0.010 | 0.000 | 0.000 | 0.000 |
| basin | 0.010 | 1.000 | 0.525 | 0.228 | 0.246 | 0.250 | 0.239 | 0.280 | 0.000 | 0.606 | 0.647 | 0.222 | 0.142 | 0.007 | 0.245 | 0.245 | 0.200 | 0.024 | 0.114 | 0.139 | 0.139 | 0.139 | 0.767 | 0.475 | 0.240 | 0.245 | 0.123 | 0.255 | 0.119 | 0.208 | 0.196 |
| construction_year | 0.408 | 0.525 | 1.000 | -0.063 | 0.318 | 0.269 | 0.314 | 0.612 | -0.003 | -0.192 | 0.479 | 0.287 | 0.081 | 0.050 | 0.267 | 0.267 | 0.082 | 0.678 | 0.025 | 0.126 | 0.129 | 0.129 | 0.945 | -0.209 | 0.289 | 0.282 | 0.098 | 0.275 | 0.130 | 0.232 | 0.230 |
| district_code | -0.089 | 0.228 | -0.063 | 1.000 | 0.115 | 0.100 | 0.107 | -0.136 | 0.000 | -0.134 | 0.131 | 0.066 | 0.049 | 0.003 | 0.089 | 0.089 | 0.186 | -0.074 | 0.174 | 0.067 | 0.074 | 0.074 | 0.385 | 0.119 | 0.075 | 0.070 | 0.070 | 0.083 | 0.058 | 0.073 | 0.069 |
| extraction_type | 0.014 | 0.246 | 0.318 | 0.115 | 1.000 | 1.000 | 1.000 | 0.160 | 0.000 | 0.195 | 0.226 | 0.166 | 0.128 | 0.000 | 0.244 | 0.244 | 0.212 | 0.029 | 0.137 | 0.173 | 0.122 | 0.122 | 0.261 | 0.169 | 0.192 | 0.408 | 0.309 | 0.483 | 0.148 | 0.505 | 0.537 |
| extraction_type_class | 0.012 | 0.250 | 0.269 | 0.100 | 1.000 | 1.000 | 1.000 | 0.168 | 0.000 | 0.196 | 0.196 | 0.200 | 0.118 | 0.009 | 0.231 | 0.231 | 0.171 | 0.027 | 0.100 | 0.160 | 0.108 | 0.108 | 0.358 | 0.139 | 0.207 | 0.455 | 0.271 | 0.440 | 0.147 | 0.504 | 0.536 |
| extraction_type_group | 0.007 | 0.239 | 0.314 | 0.107 | 1.000 | 1.000 | 1.000 | 0.156 | 0.000 | 0.186 | 0.217 | 0.157 | 0.121 | 0.000 | 0.242 | 0.242 | 0.197 | 0.027 | 0.106 | 0.170 | 0.116 | 0.116 | 0.291 | 0.164 | 0.171 | 0.393 | 0.271 | 0.465 | 0.145 | 0.504 | 0.536 |
| gps_height | 0.342 | 0.280 | 0.612 | -0.136 | 0.160 | 0.168 | 0.156 | 1.000 | -0.005 | -0.087 | 0.160 | 0.142 | 0.054 | 0.043 | 0.167 | 0.167 | 0.190 | 0.547 | 0.079 | 0.098 | 0.100 | 0.100 | 0.466 | -0.202 | 0.134 | 0.154 | 0.087 | 0.177 | 0.084 | 0.122 | 0.121 |
| id | -0.005 | 0.000 | -0.003 | 0.000 | 0.000 | 0.000 | 0.000 | -0.005 | 1.000 | 0.003 | 0.001 | 0.000 | 0.000 | 0.004 | 0.001 | 0.001 | 0.000 | 0.003 | 0.010 | 0.000 | 0.000 | 0.000 | 0.004 | 0.001 | 0.000 | 0.002 | 0.007 | 0.006 | 0.000 | 0.008 | 0.008 |
| latitude | -0.262 | 0.606 | -0.192 | -0.134 | 0.195 | 0.196 | 0.186 | -0.087 | 0.003 | 1.000 | -0.362 | 0.201 | 0.143 | -0.009 | 0.210 | 0.210 | 0.196 | -0.141 | 0.061 | 0.120 | 0.139 | 0.139 | 0.675 | 0.192 | 0.230 | 0.183 | 0.123 | 0.189 | 0.113 | 0.145 | 0.133 |
| longitude | 0.209 | 0.647 | 0.479 | 0.131 | 0.226 | 0.196 | 0.217 | 0.160 | 0.001 | -0.362 | 1.000 | 0.237 | 0.106 | 0.136 | 0.199 | 0.199 | 0.120 | 0.398 | 0.064 | 0.103 | 0.101 | 0.101 | 0.793 | -0.458 | 0.294 | 0.169 | 0.064 | 0.140 | 0.129 | 0.193 | 0.192 |
| management | 0.024 | 0.222 | 0.287 | 0.066 | 0.166 | 0.200 | 0.157 | 0.142 | 0.000 | 0.201 | 0.237 | 1.000 | 1.000 | 0.033 | 0.226 | 0.226 | 0.240 | 0.032 | 0.292 | 0.157 | 0.239 | 0.239 | 0.343 | 0.138 | 0.794 | 0.216 | 0.201 | 0.259 | 0.139 | 0.155 | 0.166 |
| management_group | 0.026 | 0.142 | 0.081 | 0.049 | 0.128 | 0.118 | 0.121 | 0.054 | 0.000 | 0.143 | 0.106 | 1.000 | 1.000 | 0.020 | 0.145 | 0.145 | 0.044 | 0.031 | 0.250 | 0.138 | 0.228 | 0.228 | 0.222 | 0.084 | 0.700 | 0.225 | 0.136 | 0.219 | 0.139 | 0.070 | 0.067 |
| num_private | 0.032 | 0.007 | 0.050 | 0.003 | 0.000 | 0.009 | 0.000 | 0.043 | 0.004 | -0.009 | 0.136 | 0.033 | 0.020 | 1.000 | 0.007 | 0.007 | 0.007 | 0.033 | 0.000 | 0.011 | 0.000 | 0.000 | 0.008 | -0.093 | 0.016 | 0.000 | 0.000 | 0.000 | 0.009 | 0.000 | 0.000 |
| payment | 0.016 | 0.245 | 0.267 | 0.089 | 0.244 | 0.231 | 0.242 | 0.167 | 0.001 | 0.210 | 0.199 | 0.226 | 0.145 | 0.007 | 1.000 | 1.000 | 0.185 | 0.019 | 0.144 | 0.143 | 0.127 | 0.127 | 0.357 | 0.153 | 0.203 | 0.204 | 0.098 | 0.190 | 0.133 | 0.163 | 0.163 |
| payment_type | 0.016 | 0.245 | 0.267 | 0.089 | 0.244 | 0.231 | 0.242 | 0.167 | 0.001 | 0.210 | 0.199 | 0.226 | 0.145 | 0.007 | 1.000 | 1.000 | 0.185 | 0.019 | 0.144 | 0.143 | 0.127 | 0.127 | 0.357 | 0.153 | 0.203 | 0.204 | 0.098 | 0.190 | 0.133 | 0.163 | 0.163 |
| permit | 0.000 | 0.200 | 0.082 | 0.186 | 0.212 | 0.171 | 0.197 | 0.190 | 0.000 | 0.196 | 0.120 | 0.240 | 0.044 | 0.007 | 0.185 | 0.185 | 1.000 | 0.035 | 0.137 | 0.119 | 0.056 | 0.056 | 0.408 | 0.159 | 0.293 | 0.221 | 0.114 | 0.219 | 0.134 | 0.155 | 0.147 |
| population | 0.343 | 0.024 | 0.678 | -0.074 | 0.029 | 0.027 | 0.027 | 0.547 | 0.003 | -0.141 | 0.398 | 0.032 | 0.031 | 0.033 | 0.019 | 0.019 | 0.035 | 1.000 | 0.032 | 0.004 | 0.009 | 0.009 | 0.050 | -0.093 | 0.044 | 0.021 | 0.000 | 0.020 | 0.000 | 0.029 | 0.028 |
| public_meeting | 0.013 | 0.114 | 0.025 | 0.174 | 0.137 | 0.100 | 0.106 | 0.079 | 0.010 | 0.061 | 0.064 | 0.292 | 0.250 | 0.000 | 0.144 | 0.144 | 0.137 | 0.032 | 1.000 | 0.060 | 0.104 | 0.104 | 0.258 | 0.120 | 0.268 | 0.109 | 0.059 | 0.096 | 0.060 | 0.094 | 0.093 |
| quality_group | 0.000 | 0.139 | 0.126 | 0.067 | 0.173 | 0.160 | 0.170 | 0.098 | 0.000 | 0.120 | 0.103 | 0.157 | 0.138 | 0.011 | 0.143 | 0.143 | 0.119 | 0.004 | 0.060 | 1.000 | 0.279 | 0.279 | 0.215 | 0.079 | 0.082 | 0.179 | 0.135 | 0.174 | 1.000 | 0.138 | 0.131 |
| quantity | 0.000 | 0.139 | 0.129 | 0.074 | 0.122 | 0.108 | 0.116 | 0.100 | 0.000 | 0.139 | 0.101 | 0.239 | 0.228 | 0.000 | 0.127 | 0.127 | 0.056 | 0.009 | 0.104 | 0.279 | 1.000 | 1.000 | 0.212 | 0.090 | 0.148 | 0.205 | 0.137 | 0.199 | 0.280 | 0.092 | 0.084 |
| quantity_group | 0.000 | 0.139 | 0.129 | 0.074 | 0.122 | 0.108 | 0.116 | 0.100 | 0.000 | 0.139 | 0.101 | 0.239 | 0.228 | 0.000 | 0.127 | 0.127 | 0.056 | 0.009 | 0.104 | 0.279 | 1.000 | 1.000 | 0.212 | 0.090 | 0.148 | 0.205 | 0.137 | 0.199 | 0.280 | 0.092 | 0.084 |
| region | 0.014 | 0.767 | 0.945 | 0.385 | 0.261 | 0.358 | 0.291 | 0.466 | 0.004 | 0.675 | 0.793 | 0.343 | 0.222 | 0.008 | 0.357 | 0.357 | 0.408 | 0.050 | 0.258 | 0.215 | 0.212 | 0.212 | 1.000 | 0.790 | 0.381 | 0.322 | 0.220 | 0.356 | 0.198 | 0.294 | 0.271 |
| region_code | -0.150 | 0.475 | -0.209 | 0.119 | 0.169 | 0.139 | 0.164 | -0.202 | 0.001 | 0.192 | -0.458 | 0.138 | 0.084 | -0.093 | 0.153 | 0.153 | 0.159 | -0.093 | 0.120 | 0.079 | 0.090 | 0.090 | 0.790 | 1.000 | 0.170 | 0.126 | 0.082 | 0.109 | 0.074 | 0.127 | 0.122 |
| scheme_management | 0.019 | 0.240 | 0.289 | 0.075 | 0.192 | 0.207 | 0.171 | 0.134 | 0.000 | 0.230 | 0.294 | 0.794 | 0.700 | 0.016 | 0.203 | 0.203 | 0.293 | 0.044 | 0.268 | 0.082 | 0.148 | 0.148 | 0.381 | 0.170 | 1.000 | 0.224 | 0.218 | 0.269 | 0.085 | 0.171 | 0.182 |
| source | 0.007 | 0.245 | 0.282 | 0.070 | 0.408 | 0.455 | 0.393 | 0.154 | 0.002 | 0.183 | 0.169 | 0.216 | 0.225 | 0.000 | 0.204 | 0.204 | 0.221 | 0.021 | 0.109 | 0.179 | 0.205 | 0.205 | 0.322 | 0.126 | 0.224 | 1.000 | 1.000 | 1.000 | 0.152 | 0.379 | 0.385 |
| source_class | 0.017 | 0.123 | 0.098 | 0.070 | 0.309 | 0.271 | 0.271 | 0.087 | 0.007 | 0.123 | 0.064 | 0.201 | 0.136 | 0.000 | 0.098 | 0.098 | 0.114 | 0.000 | 0.059 | 0.135 | 0.137 | 0.137 | 0.220 | 0.082 | 0.218 | 1.000 | 1.000 | 1.000 | 0.135 | 0.284 | 0.284 |
| source_type | 0.010 | 0.255 | 0.275 | 0.083 | 0.483 | 0.440 | 0.465 | 0.177 | 0.006 | 0.189 | 0.140 | 0.259 | 0.219 | 0.000 | 0.190 | 0.190 | 0.219 | 0.020 | 0.096 | 0.174 | 0.199 | 0.199 | 0.356 | 0.109 | 0.269 | 1.000 | 1.000 | 1.000 | 0.159 | 0.367 | 0.380 |
| water_quality | 0.000 | 0.119 | 0.130 | 0.058 | 0.148 | 0.147 | 0.145 | 0.084 | 0.000 | 0.113 | 0.129 | 0.139 | 0.139 | 0.009 | 0.133 | 0.133 | 0.134 | 0.000 | 0.060 | 1.000 | 0.280 | 0.280 | 0.198 | 0.074 | 0.085 | 0.152 | 0.135 | 0.159 | 1.000 | 0.127 | 0.132 |
| waterpoint_type | 0.000 | 0.208 | 0.232 | 0.073 | 0.505 | 0.504 | 0.504 | 0.122 | 0.008 | 0.145 | 0.193 | 0.155 | 0.070 | 0.000 | 0.163 | 0.163 | 0.155 | 0.029 | 0.094 | 0.138 | 0.092 | 0.092 | 0.294 | 0.127 | 0.171 | 0.379 | 0.284 | 0.367 | 0.127 | 1.000 | 1.000 |
| waterpoint_type_group | 0.000 | 0.196 | 0.230 | 0.069 | 0.537 | 0.536 | 0.536 | 0.121 | 0.008 | 0.133 | 0.192 | 0.166 | 0.067 | 0.000 | 0.163 | 0.163 | 0.147 | 0.028 | 0.093 | 0.131 | 0.084 | 0.084 | 0.271 | 0.122 | 0.182 | 0.385 | 0.284 | 0.380 | 0.132 | 1.000 | 1.000 |
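Several column pairs in the matrix above have a coefficient of 1.000, which suggests one column of each pair carries no extra information. The following is a minimal sketch for listing such pairs, using a small subset of coefficients copied from the table; the `redundant_pairs` helper and the 0.95 threshold are assumptions for illustration, not something the report prescribes.

```python
# Illustrative subset of coefficients copied from the correlation table.
correlations = {
    ("payment", "payment_type"): 1.000,
    ("quantity", "quantity_group"): 1.000,
    ("quality_group", "water_quality"): 1.000,
    ("construction_year", "region"): 0.945,
    ("management", "scheme_management"): 0.794,
    ("amount_tsh", "gps_height"): 0.342,
}


def redundant_pairs(corr, threshold=0.95):
    """Return the column pairs whose absolute coefficient meets the threshold."""
    return sorted(pair for pair, value in corr.items() if abs(value) >= threshold)


for left, right in redundant_pairs(correlations):
    print(f"{left} <-> {right}")
```

Whether to drop one column from each flagged pair is a modelling decision; a high coefficient between categorical columns only indicates that one is largely determined by the other.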
Missing values
Sample
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69572 | 6000.0 | 2011-03-14 | Roman | 1390 | Roman | 34.938093 | -9.856322 | none | 0 | Lake Nyasa | Mnyusi B | Iringa | 11 | 5 | Ludewa | Mundindi | 109 | True | GeoData Consultants Ltd | VWC | Roman | False | 1999 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe |
| 1 | 8776 | 0.0 | 2013-03-06 | Grumeti | 1399 | GRUMETI | 34.698766 | -2.147466 | Zahanati | 0 | Lake Victoria | Nyamara | Mara | 20 | 2 | Serengeti | Natta | 280 | NaN | GeoData Consultants Ltd | Other | NaN | True | 2010 | gravity | gravity | gravity | wug | user-group | never pay | never pay | soft | good | insufficient | insufficient | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe |
| 2 | 34310 | 25.0 | 2013-02-25 | Lottery Club | 686 | World vision | 37.460664 | -3.821329 | Kwa Mahundi | 0 | Pangani | Majengo | Manyara | 21 | 4 | Simanjiro | Ngorika | 250 | True | GeoData Consultants Ltd | VWC | Nyumba ya mungu pipe scheme | True | 2009 | gravity | gravity | gravity | vwc | user-group | pay per bucket | per bucket | soft | good | enough | enough | dam | dam | surface | communal standpipe multiple | communal standpipe |
| 3 | 67743 | 0.0 | 2013-01-28 | Unicef | 263 | UNICEF | 38.486161 | -11.155298 | Zahanati Ya Nanyumbu | 0 | Ruvuma / Southern Coast | Mahakamani | Mtwara | 90 | 63 | Nanyumbu | Nanyumbu | 58 | True | GeoData Consultants Ltd | VWC | NaN | True | 1986 | submersible | submersible | submersible | vwc | user-group | never pay | never pay | soft | good | dry | dry | machine dbh | borehole | groundwater | communal standpipe multiple | communal standpipe |
| 4 | 19728 | 0.0 | 2011-07-13 | Action In A | 0 | Artisan | 31.130847 | -1.825359 | Shuleni | 0 | Lake Victoria | Kyanyamisa | Kagera | 18 | 1 | Karagwe | Nyakasimbi | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | gravity | gravity | gravity | other | other | never pay | never pay | soft | good | seasonal | seasonal | rainwater harvesting | rainwater harvesting | surface | communal standpipe | communal standpipe |
| 5 | 9944 | 20.0 | 2011-03-13 | Mkinga Distric Coun | 0 | DWE | 39.172796 | -4.765587 | Tajiri | 0 | Pangani | Moa/Mwereme | Tanga | 4 | 8 | Mkinga | Moa | 1 | True | GeoData Consultants Ltd | VWC | Zingibali | True | 2009 | submersible | submersible | submersible | vwc | user-group | pay per bucket | per bucket | salty | salty | enough | enough | other | other | unknown | communal standpipe multiple | communal standpipe |
| 6 | 19816 | 0.0 | 2012-10-01 | Dwsp | 0 | DWSP | 33.362410 | -3.766365 | Kwa Ngomho | 0 | Internal | Ishinabulandi | Shinyanga | 17 | 3 | Shinyanga Rural | Samuye | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump |
| 7 | 54551 | 0.0 | 2012-10-09 | Rwssp | 0 | DWE | 32.620617 | -4.226198 | Tushirikiane | 0 | Lake Tanganyika | Nyawishi Center | Shinyanga | 17 | 3 | Kahama | Chambo | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | wug | user-group | unknown | unknown | milky | milky | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump |
| 8 | 53934 | 0.0 | 2012-11-03 | Wateraid | 0 | Water Aid | 32.711100 | -5.146712 | Kwa Ramadhan Musa | 0 | Lake Tanganyika | Imalauduki | Tabora | 14 | 6 | Tabora Urban | Itetemia | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | india mark ii | india mark ii | handpump | vwc | user-group | never pay | never pay | salty | salty | seasonal | seasonal | machine dbh | borehole | groundwater | hand pump | hand pump |
| 9 | 46144 | 0.0 | 2011-08-03 | Isingiro Ho | 0 | Artisan | 30.626991 | -1.257051 | Kwapeto | 0 | Lake Victoria | Mkonomre | Kagera | 18 | 1 | Karagwe | Kaisho | 0 | True | GeoData Consultants Ltd | NaN | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump |

Last rows
| | id | amount_tsh | date_recorded | funder | gps_height | installer | longitude | latitude | wpt_name | num_private | basin | subvillage | region | region_code | district_code | lga | ward | population | public_meeting | recorded_by | scheme_management | scheme_name | permit | construction_year | extraction_type | extraction_type_group | extraction_type_class | management | management_group | payment | payment_type | water_quality | quality_group | quantity | quantity_group | source | source_type | source_class | waterpoint_type | waterpoint_type_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 59390 | 13677 | 0.0 | 2011-08-04 | Rudep | 1715 | DWE | 31.370848 | -8.258160 | Kwa Mzee Atanas | 0 | Lake Tanganyika | Kitonto | Rukwa | 15 | 2 | Sumbawanga Rural | Mkowe | 150 | True | GeoData Consultants Ltd | VWC | NaN | False | 1991 | swn 80 | swn 80 | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | machine dbh | borehole | groundwater | hand pump | hand pump |
| 59391 | 44885 | 0.0 | 2013-08-03 | Government Of Tanzania | 540 | Government | 38.044070 | -4.272218 | Kwa | 0 | Pangani | Maore Kati | Kilimanjaro | 3 | 3 | Same | Maore | 210 | True | GeoData Consultants Ltd | Water authority | Hingilili | True | 1967 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe |
| 59392 | 40607 | 0.0 | 2011-04-15 | Government Of Tanzania | 0 | Government | 33.009440 | -8.520888 | Benard Charles | 0 | Lake Rukwa | Mbuyuni A | Mbeya | 12 | 1 | Chunya | Mbuyuni | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | gravity | gravity | gravity | vwc | user-group | never pay | never pay | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe |
| 59393 | 48348 | 0.0 | 2012-10-27 | Private | 0 | Private | 33.866852 | -4.287410 | Kwa Peter | 0 | Internal | Masanga | Tabora | 14 | 2 | Igunga | Igunga | 0 | False | GeoData Consultants Ltd | Water authority | NaN | False | 0 | gravity | gravity | gravity | private operator | commercial | pay per bucket | per bucket | soft | good | insufficient | insufficient | dam | dam | surface | other | other |
| 59394 | 11164 | 500.0 | 2011-03-09 | World Bank | 351 | ML appro | 37.634053 | -6.124830 | Chimeredya | 0 | Wami / Ruvu | Komstari | Morogoro | 5 | 6 | Mvomero | Diongoya | 89 | True | GeoData Consultants Ltd | VWC | NaN | True | 2007 | submersible | submersible | submersible | vwc | user-group | pay monthly | monthly | soft | good | enough | enough | machine dbh | borehole | groundwater | communal standpipe | communal standpipe |
| 59395 | 60739 | 10.0 | 2013-05-03 | Germany Republi | 1210 | CES | 37.169807 | -3.253847 | Area Three Namba 27 | 0 | Pangani | Kiduruni | Kilimanjaro | 3 | 5 | Hai | Masama Magharibi | 125 | True | GeoData Consultants Ltd | Water Board | Losaa Kia water supply | True | 1999 | gravity | gravity | gravity | water board | user-group | pay per bucket | per bucket | soft | good | enough | enough | spring | spring | groundwater | communal standpipe | communal standpipe |
| 59396 | 27263 | 4700.0 | 2011-05-07 | Cefa-njombe | 1212 | Cefa | 35.249991 | -9.070629 | Kwa Yahona Kuvala | 0 | Rufiji | Igumbilo | Iringa | 11 | 4 | Njombe | Ikondo | 56 | True | GeoData Consultants Ltd | VWC | Ikondo electrical water sch | True | 1996 | gravity | gravity | gravity | vwc | user-group | pay annually | annually | soft | good | enough | enough | river | river/lake | surface | communal standpipe | communal standpipe |
| 59397 | 37057 | 0.0 | 2011-04-11 | NaN | 0 | NaN | 34.017087 | -8.750434 | Mashine | 0 | Rufiji | Madungulu | Mbeya | 12 | 7 | Mbarali | Chimala | 0 | True | GeoData Consultants Ltd | VWC | NaN | False | 0 | swn 80 | swn 80 | handpump | vwc | user-group | pay monthly | monthly | fluoride | fluoride | enough | enough | machine dbh | borehole | groundwater | hand pump | hand pump |
| 59398 | 31282 | 0.0 | 2011-03-08 | Malec | 0 | Musa | 35.861315 | -6.378573 | Mshoro | 0 | Rufiji | Mwinyi | Dodoma | 1 | 4 | Chamwino | Mvumi Makulu | 0 | True | GeoData Consultants Ltd | VWC | NaN | True | 0 | nira/tanira | nira/tanira | handpump | vwc | user-group | never pay | never pay | soft | good | insufficient | insufficient | shallow well | shallow well | groundwater | hand pump | hand pump |
| 59399 | 26348 | 0.0 | 2011-03-23 | World Bank | 191 | World | 38.104048 | -6.747464 | Kwa Mzee Lugawa | 0 | Wami / Ruvu | Kikatanyemba | Morogoro | 5 | 2 | Morogoro Rural | Ngerengere | 150 | True | GeoData Consultants Ltd | VWC | NaN | True | 2002 | nira/tanira | nira/tanira | handpump | vwc | user-group | pay when scheme fails | on failure | salty | salty | enough | enough | shallow well | shallow well | groundwater | hand pump | hand pump |